48 research outputs found

    Identifying health status of wind turbines by using self organizing maps and interpretation-oriented post-processing tools

    Identifying the health status of wind turbines is critical to reducing the impact of failures on generation costs (between 25% and 35%). This is a time-consuming task, since a human expert has to explore turbines individually. Methods: To optimize this process, we present a strategy based on Self-Organizing Maps (SOM), clustering, and a further grouping of turbines based on the centroids of their SOM clusters, generating groups of turbines that behave similarly with respect to subsystem failure. The human expert can then diagnose the health of the wind farm by analysing a small sample from each group. Post-processing tools such as class panel graphs and traffic lights panels enhance the conceptualization of the clusters, providing additional information about the real scenarios the clusters point to and contributing to a better diagnosis. Results: The proposed approach has been tested in real wind farms with different characteristics (number of wind turbines, manufacturers, power, type of sensors, ...) and compared with classical clustering. Conclusions: Experimental results show that healthy, unhealthy and intermediate states have been detected. In addition, the operational modes identified for each wind turbine outperform those obtained with classical clustering techniques, capturing the intrinsic stationarity of the data.
    Peer reviewed. Postprint (published version).

    Decomposition methods for machine learning with small, incomplete or noisy datasets

    In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are small to begin with, so more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in selected practical machine learning examples, including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.
    Affiliations: Caiafa, César Federico (Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina); Sole Casals, Jordi (Center for Advanced Intelligence; Japan); Marti Puig, Pere (University of Catalonia; Spain); Sun, Zhe (RIKEN; Japan); Tanaka, Toshihisa (Tokyo University of Agriculture and Technology; Japan).

    Machine Learning Methods with Noisy, Incomplete or Small Datasets

    In this article, we present a collection of fifteen novel contributions on machine learning methods with low-quality or imperfect datasets, which were accepted for publication in the special issue “Machine Learning Methods with Noisy, Incomplete or Small Datasets”, Applied Sciences (ISSN 2076-3417). These papers provide a variety of novel approaches to real-world machine learning problems where available datasets suffer from imperfections such as missing values, noise or artefacts. Contributions in applied sciences include medical applications, epidemic management tools, methodological work, and industrial applications, among others. We believe that this special issue will bring new ideas for solving this challenging problem, and will provide clear examples of application in real-world scenarios.
    Affiliations: Caiafa, César Federico (Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina); Zhe, Sun (Lab. Adaptive Intelligence - Riken; Japan); Tanaka, Toshihisa (Tokyo University of Agriculture and Technology; Japan); Marti Puig, Pere (University of Vic; Spain); Solé Casals, Jordi (University of Vic; Spain).

    Automatic ham classification method based on support vector machine model increases accuracy and benefits compared to manual classification

    The subcutaneous fat thickness (SFT) is a very important parameter in hams, since it determines the process to which the ham will be submitted. This study compares two methods for predicting SFT on the slaughter line, in terms of accuracy and economic benefit: an automatic system using a support vector machine (SVM) model and a manual measurement of the fat carried out by an experienced operator. Both methods were compared against the gold standard, obtained by measuring SFT with a ruler in a sample of 400 hams equally distributed across the SFT classes. The results show that the SFT prediction made by the SVM model achieves an accuracy of 75.3%, an improvement of 5.5% over the manual measurement. Regarding economic benefits, the SVM model can increase them by between 12% and 17%. It can be concluded that classification using SVM is more accurate than manual classification and increases the economic benefit of sorting.
    Accepted version.

    Efficient cubic spline interpolation implemented with FIR filters

    Classical cubic spline interpolation requires solving a high-dimensional system of equations. In this work we show how to compute the interpolant using a FIR digital filter, with a reduced number of operations per interpolated point and high accuracy. Additionally, the computation can be performed in real time as the signal samples are acquired. Following this approach, we show how the derivatives of the interpolant can easily be obtained in a similar way, as well as signal approximations that reduce the oscillations that appear when using high-order splines. These techniques are very well suited to computing continuous representations of image contours on closed shapes and to finding their curvature and singularities.
    Peer reviewed. Postprint (published version).
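
    The idea of replacing the linear system with a FIR filter can be illustrated with a minimal sketch. It is not the filter design of the paper: it assumes uniformly spaced samples, uses the textbook cubic B-spline prefilter (pole at sqrt(3) - 2) truncated to a finite length, and repeats edge samples at the boundaries. The names `prefilter_taps`, `fir_same`, `bspline3` and `interpolate` are illustrative.

```python
import math

Z1 = math.sqrt(3.0) - 2.0  # pole of the inverse cubic B-spline filter


def prefilter_taps(half_len=12):
    """Truncated (FIR) impulse response of the inverse of [1/6, 4/6, 1/6]."""
    gain = -6.0 * Z1 / (1.0 - Z1 * Z1)
    return [gain * Z1 ** abs(k) for k in range(-half_len, half_len + 1)]


def fir_same(signal, taps):
    """Centered FIR filtering; edge samples are repeated at the boundaries."""
    half = len(taps) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, tap in enumerate(taps):
            idx = min(max(i + j - half, 0), n - 1)
            acc += tap * signal[idx]
        out.append(acc)
    return out


def bspline3(t):
    """Cubic B-spline basis function centered at zero."""
    t = abs(t)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t ** 3
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0


def interpolate(coeffs, x):
    """Evaluate the spline at a (possibly fractional) sample position x."""
    i0 = math.floor(x)
    total = 0.0
    for k in range(i0 - 1, i0 + 3):
        kk = min(max(k, 0), len(coeffs) - 1)
        total += coeffs[kk] * bspline3(x - k)
    return total


samples = [math.sin(0.3 * n) for n in range(40)]
coeffs = fir_same(samples, prefilter_taps())
print(interpolate(coeffs, 20.5), math.sin(0.3 * 20.5))
```

    Because the prefilter taps decay geometrically, a short FIR filter already reproduces the samples closely away from the signal edges; the truncation length trades accuracy against operations per point.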

    Serial-EMD: Fast Empirical Mode Decomposition Method for Multi-dimensional Signals Based on Serialization

    Empirical mode decomposition (EMD) has developed into a prominent tool for adaptive, scale-based signal analysis in fields such as robotics, security and biomedical engineering. As the dramatic increase in the amount of data places higher demands on real-time signal analysis, existing EMD methods and their variants struggle to balance growth in data dimension against analysis speed. To decompose multi-dimensional signals faster, we present a novel signal-serialization method (serial-EMD), which concatenates multi-variate or multi-dimensional signals into a one-dimensional signal and uses various one-dimensional EMD algorithms to decompose it. To verify the effects of the proposed method, synthetic multi-variate time series, artificial 2D images with various textures and real-world facial images are tested. Compared with existing multi-EMD algorithms, the decomposition time is significantly reduced. In addition, facial recognition with Intrinsic Mode Functions (IMFs) extracted using our method achieves a higher accuracy than with IMFs obtained by existing multi-EMD algorithms, which demonstrates the superior performance of our method in terms of IMF quality. Furthermore, this method offers a new perspective for optimizing existing EMD algorithms: transforming the structure of the input signal rather than being limited to developing envelope computation techniques or signal decomposition methods. In summary, the study suggests that the serial-EMD technique is a highly competitive and fast alternative for multi-dimensional signal analysis.
    Affiliations: Zhang, Jin (Nankai University; China); Feng, Fan (Nankai University; China); Marti Puig, Pere (Central University of Catalonia; Spain); Caiafa, César Federico (Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina); Sun, Zhe (RIKEN; Japan); Duan, Feng (Nankai University; China); Sole Casals, Jordi (Central University of Catalonia; Spain).
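
    The serialization step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: channels are joined by plain concatenation (any treatment of the joins between channels used in the paper is not shown), and `toy_decompose` is a hypothetical stand-in for a real one-dimensional EMD routine, splitting the signal into a moving-average trend and a residual.

```python
def serialize(channels):
    """Concatenate equal-length channels into a single 1-D list."""
    flat = []
    for channel in channels:
        flat.extend(channel)
    return flat


def deserialize(flat, n_channels):
    """Split a 1-D list back into n_channels equal-length channels."""
    length = len(flat) // n_channels
    return [flat[i * length:(i + 1) * length] for i in range(n_channels)]


def serial_decompose(channels, decompose_1d):
    """Decompose all channels with one call to a 1-D decomposition routine."""
    flat = serialize(channels)
    components = decompose_1d(flat)  # list of 1-D components
    return [deserialize(comp, len(channels)) for comp in components]


def toy_decompose(signal, width=5):
    """Stand-in for 1-D EMD (assumption): returns [detail, trend]."""
    half = width // 2
    n = len(signal)
    trend = []
    for i in range(n):
        window = signal[max(0, i - half):min(n, i + half + 1)]
        trend.append(sum(window) / len(window))
    detail = [s - t for s, t in zip(signal, trend)]
    return [detail, trend]


channels = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
components = serial_decompose(channels, toy_decompose)
```

    Each entry of `components` has the same channel layout as the input, so per-channel components can be read off directly after a single 1-D decomposition.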

    Detection of Wind Turbine Failures through Cross-Information between Neighbouring Turbines

    In this paper, the time variation of SCADA signals from several geographically close turbines is analysed and compared. When the turbines operate correctly, their signals show a clear pattern of joint variation. However, a failure in one of the turbines causes the signals from the faulty turbine to decouple from this pattern. Based on this observation, SCADA data is used, firstly, to derive reference signals describing the pattern and, secondly, to compare the evolution of the different turbines with respect to this joint variation. This makes it possible to determine whether the behaviour of the group is correct, because the turbines maintain the well-functioning pattern, or whether some of them have decoupled. The presented strategy is very effective and can provide important support for decision making in turbine maintenance and, in the near future, help improve the classification of signals for training supervised normality models. In addition to being very effective, it is a strategy of low computational cost, which can add great value to the SCADA data systems present in wind farms.
    Peer reviewed. Postprint (published version).
    Sustainable Development Goals: 7 - Affordable and Clean Energy. 7.a - By 2030, enhance international cooperation to facilitate access to clean energy research and technology, including renewable energy, energy efficiency and advanced and cleaner fossil-fuel technology, and promote investment in energy infrastructure and clean energy technology. 7.b - By 2030, expand infrastructure and upgrade technology for supplying modern and sustainable energy services for all in developing countries, in particular least developed countries, small island developing States and landlocked developing countries, in accordance with their respective programmes of support.
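
    A rough sketch of the cross-information idea is given below. The abstract does not say how the reference signals are derived, so the pointwise median across turbines and the mean absolute deviation score are assumptions made purely for illustration, as are the turbine names and temperature values.

```python
from statistics import median


def reference_signal(signals):
    """Pointwise median across turbines (assumed form of the reference)."""
    return [median(values) for values in zip(*signals.values())]


def decoupling_scores(signals):
    """Mean absolute deviation of each turbine from the reference signal."""
    ref = reference_signal(signals)
    return {
        name: sum(abs(v - r) for v, r in zip(series, ref)) / len(ref)
        for name, series in signals.items()
    }


# Hypothetical temperature readings from three neighbouring turbines.
signals = {
    "T1": [50, 52, 54, 56],
    "T2": [51, 53, 55, 57],
    "T3": [50, 52, 70, 85],  # drifts away from its neighbours
}
scores = decoupling_scores(signals)
suspect = max(scores, key=scores.get)
```

    With a median reference, a single drifting turbine barely shifts the reference itself, which is one reason such a choice is plausible for a low-cost screening step.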

    A fast approach to removing muscle artifacts for EEG with signal serialization based Ensemble Empirical Mode Decomposition

    An electroencephalogram (EEG) is an electrophysiological signal reflecting the functional state of the brain. As the control signal of a brain-computer interface (BCI), EEG may build a bridge between humans and computers and improve the quality of life of patients with movement disorders. Collected EEG signals are extremely susceptible to contamination by electromyography (EMG) artifacts, which alter their original characteristics. EEG denoising is therefore an essential preprocessing step in any BCI system. Previous studies have confirmed that the combination of ensemble empirical mode decomposition (EEMD) and canonical correlation analysis (CCA) can effectively suppress EMG artifacts. However, the time-consuming iterative process of EEMD limits the application of the EEMD-CCA method in real-time BCI monitoring. Compared with existing EEMD, the recently proposed signal serialization based EEMD (sEEMD) provides effective signal analysis with fast mode decomposition. In this study, an EMG denoising method based on sEEMD and CCA is discussed. All analyses are carried out on semi-simulated data. The results show that, in terms of frequency and amplitude, the intrinsic mode functions (IMFs) decomposed by sEEMD are consistent with the IMFs obtained by EEMD. There is no significant difference between the sEEMD-CCA method and the EEMD-CCA method in the ability to separate EMG artifacts from EEG signals (p > 0.05). Even under heavy contamination (signal-to-noise ratio below 2 dB), the relative root mean squared error is about 0.3, and the average correlation coefficient remains above 0.9. The sEEMD-CCA method removes EMG artifacts significantly faster than the EEMD-CCA method (p < 0.05): its running time for three lengths of semi-simulated data is shortened by more than 50%. This indicates that sEEMD-CCA is a promising tool for EMG artifact removal in real-time BCI systems.
    Affiliations: Dai, Yangyang (Nankai University; China); Duan, Feng (Nankai University; China); Feng, Fan (Nankai University; China); Sun, Zhe (RIKEN; Japan); Zhang, Yu (Lehigh University Bethlehem; United States); Caiafa, César Federico (Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina); Marti Puig, Pere (Central University of Catalonia; Spain); Solé Casals, Jordi (Central University of Catalonia; Spain).
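
    The two quality measures quoted in the abstract can be computed as in the sketch below. The abstract does not give formulas, so this assumes the definitions commonly used in EEG denoising work: the relative root mean squared error as RMS(clean - denoised) / RMS(clean), and the Pearson correlation coefficient between the clean and denoised signals. The test signal is synthetic.

```python
import math


def rrmse(clean, denoised):
    """Relative RMSE: RMS of the error divided by RMS of the clean signal."""
    n = len(clean)
    err = math.sqrt(sum((c - d) ** 2 for c, d in zip(clean, denoised)) / n)
    ref = math.sqrt(sum(c * c for c in clean) / n)
    return err / ref


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length signals."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic example: the "denoised" signal is the clean one scaled by 10%.
clean = [math.sin(0.2 * n) for n in range(100)]
denoised = [1.1 * c for c in clean]
print(rrmse(clean, denoised), correlation(clean, denoised))
```

    The example also shows why both measures are usually reported together: a pure amplitude error leaves the correlation at 1 while the relative error is 0.1.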

    On-line Ham Grading using pattern recognition models based on available data in commercial pig slaughterhouses

    The thickness of the subcutaneous fat in hams is one of the most important factors in the dry-curing process and largely determines the final quality of the product. In slaughterhouses, this parameter is usually measured manually with a ruler in order to classify hams. The aim of the present study was to propose an automatic classification method, based on data obtained from automatic carcass classification equipment (AutoFom) and intrinsic data of the pigs (sex, breed, and weight), to simulate the manual classification system. The classification algorithms evaluated were decision trees, support vector machines (SVM), k-nearest neighbours and discriminant analysis. A total of 4000 hams selected by breed and sex were classified as thin (0–10 mm), standard (11–15 mm), semi-fat (16–20 mm) and fat (>20 mm). The most reliable model, with a success rate of 73%, was an SVM with a Gaussian kernel using all available data. These results suggest that the proposed classification method can be a useful online tool for classifying hams in slaughterhouses.
    Accepted version.
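
    The four thickness classes listed in the abstract can be written as a small lookup, which is the labelling a classifier would be trained to reproduce. The function name `sft_class` is illustrative, and the handling of non-integer values (each upper bound is inclusive, so 10.5 mm falls in "standard") is an assumption, since the abstract gives the ranges in whole millimetres only.

```python
def sft_class(thickness_mm):
    """Map subcutaneous fat thickness in mm to the class named in the abstract."""
    if thickness_mm < 0:
        raise ValueError("thickness must be non-negative")
    if thickness_mm <= 10:
        return "thin"
    if thickness_mm <= 15:
        return "standard"
    if thickness_mm <= 20:
        return "semi-fat"
    return "fat"


labels = [sft_class(t) for t in (8, 11, 16, 23)]
```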